-
Notifications
You must be signed in to change notification settings - Fork 8.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Fleet] Don't attempt package installation while another installation might still be running #84190
Conversation
3ebb4bb
to
e0aad6c
Compare
Pinging @elastic/ingest-management (Feature:Fleet) |
@neptunian and @jonathan-buttner I would be very interested to hear if you think this change is worth it at all. |
Hmm, @skh do you think returning a partial list of the assets is helpful to the requester? I wonder if it'd be better to return some status code that indicates that an installation request is already running? Or that the service is busy and the requester should try again later. That way it could try, then wait for a bit and try again and receive the complete list of assets. |
Good point. I was thinking of how the UI currently uses the return value, and by returning just a list, albeit a shorter or empty one, the UI doesn't need additional changes for what should be a rare edge case. |
Gotcha, the Security Solution app calls the bulk upgrade endpoint which uses this code path I believe. We don't check the return value so we're fine there. I still think it might be worth refactoring this in the future because other apps might call the apis frequently which could cause the issue to happen more often. |
@jonathan-buttner Thanks! I've opened #84656 and #84651 as follow-up issues. |
💚 Build SucceededMetrics [docs]
History
To update your PR or re-run it, just comment with: |
* master: (63 commits) Revert the Revert of "[Alerting] renames Resolved action group to Recovered (elastic#84123)" (elastic#84662) declare kbn/monaco dependency on kbn/i18n explicitly (elastic#84660) Remove unscripted fields from sample data index-pattern saved objects (elastic#84659) [ML] Fix unnecessary trigger of wildcard field type search for ML plugin routes. (elastic#84605) Update create.asciidoc (elastic#84046) [Security Solution][Detections] Fix labels and issue with mandatory fields (elastic#84525) Fix flaky test suite (elastic#84602) [Security Solution] [Detections] Create a 'partial failure' status for rules (elastic#84293) Revert "[Alerting] renames Resolved action group to Recovered (elastic#84123)" Update code-comments describing babel plugins (elastic#84622) [Security Solution] [Cases] Cypress for case connector selector options (elastic#80745) [Discover] Unskip doc table tests (elastic#84564) [Lens] (Accessibility) Improve landmarks in Lens (elastic#84511) [Lens] (Accessibility) Focus mistakenly stops on righthand form (elastic#84519) Return early when parallel install process detected (elastic#84190) [Security Solution][Detections] Support arrays in event fields for Severity/Risk overrides (elastic#83723) [Security Solution][Detections] Fix grammatical error in validation message for threshold field in "Create new rule" -> "Define rule" (elastic#84490) [Fleet] Update agent details page (elastic#84434) adding documentation of use of NODE_EXTRA_CA_CERTS env var (elastic#84578) [Search] Integrate "Send to background" UI with session service (elastic#83073) ...
Summary
Implements #75810
This changes
_install_package()
to return early when it detects that the package might be in the process of being installed by another process.It does so by checking the
install_status
andinstall_started_at
fields on theepm-packages
saved object. If the previous installation was started longer thanMAX_TIME_COMPLETE_INSTALL
(currently, 60 seconds) ago, the installation will be attempted again.MAX_TIME_COMPLETE_INSTALL
is also used byensurePackagesCompletedInstall()
to decide whether to reattempt package installations. This should not break, or be broken by, the change made in this PR.How to test this
I've created https://gist.github.com/skh/cc695952031c9e349874b898c7066e42 to test this locally. On my system, with a local ES instance
Fixing this is beyond the scope of this PR.
UPDATE: after discussing with the Kibana Core team, the
409
error above happens when the saved object version has changed between a read and a subsequent update operation. We can useSavedObjectsClient.errors.isConflictError()
to catch this specific case, and repeat the check forinstalled_status
andinstall_started_at
.Follow-up issues: #84651 and #84656